Automatic Evaluation of Topic Coherence

Authors

David Newman

Jey Han Lau

Karl Grieser

Timothy Baldwin

Abstract

This paper introduces the novel task of topic coherence evaluation, whereby a set of words, as generated by a topic model, is rated for coherence or interpretability. We apply a range of topic scoring models to the evaluation task, drawing on WordNet, Wikipedia and the Google search engine, and existing research on lexical similarity/relatedness. In comparison with human scores for a set of learned topics over two distinct datasets, we show a simple co-occurrence measure based on pointwise mutual information over Wikipedia data is able to achieve results for the task at or nearing the level of inter-annotator correlation, and that other Wikipedia-based lexical relatedness methods also achieve strong results. Google produces strong, if less consistent, results, while our results over WordNet are patchy at best.
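The co-occurrence measure mentioned in the abstract scores a topic by how strongly its words co-occur in a reference corpus. The following is a minimal sketch, not the authors' exact formulation. It assumes mean pairwise PMI over the topic words, document-level co-occurrence counts, a tiny toy corpus standing in for Wikipedia, and a small smoothing constant `eps`. The names `pmi_coherence`, `toy_docs`, and `eps` are hypothetical.

```python
import math
from itertools import combinations


def pmi_coherence(topic_words, documents, eps=1e-12):
    """Mean pairwise PMI of topic_words, estimated from document co-occurrence.

    Assumptions (illustrative only): probabilities are the fraction of
    documents containing a word or word pair, and eps avoids log(0) for
    pairs that never co-occur.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        hits = sum(1 for d in doc_sets if all(w in d for w in words))
        return hits / n_docs

    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p1 == 0 or p2 == 0:
            continue  # word unseen in the reference corpus: no estimate
        scores.append(math.log((p12 + eps) / (p1 * p2)))
    return sum(scores) / len(scores) if scores else 0.0


# Toy reference corpus (hypothetical data, standing in for Wikipedia).
toy_docs = [
    ["space", "nasa", "launch", "shuttle"],
    ["space", "nasa", "orbit"],
    ["stock", "market", "trade"],
    ["stock", "market", "space"],
]

coherent_score = pmi_coherence(["space", "nasa"], toy_docs)
incoherent_score = pmi_coherence(["nasa", "market"], toy_docs)
```

On this toy corpus the pair that co-occurs ("space", "nasa") scores higher than the pair that never appears together ("nasa", "market"). A real evaluation would estimate the probabilities from a large corpus and compare the scores with human coherence ratings.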

Similar resources

Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation

Purpose: This study investigates automatic keyword extraction from the tables of contents of Persian e-books in the field of science using LDA topic modeling, and evaluates the similarity of the extracted keywords to a gold standard and users' views of them. Methodology: This is a mixed text-mining study in which LDA topic modeling is used to extract keywords from the table of contents of sci...

Effectiveness of Compassion Therapy on Self-coherence, Post-divorce Adjustment and Negative Automatic Thoughts in Divorced Women

Introduction: Divorce harms the health of both spouses, especially women. Compassion therapy is a treatment method derived from the third wave of psychotherapy on which little research has been done. The present research therefore aimed to determine the effectiveness of compassion therapy on self-coherence, post-divorce adjustment and negative automatic thoughts in divorced w...

The Sensitivity of Topic Coherence Evaluation to Topic Cardinality

When evaluating the quality of topics generated by a topic model, the convention is to score topic coherence, either manually or automatically, using the top-N topic words. This hyper-parameter N, or the cardinality of the topic, is often overlooked and selected arbitrarily. In this paper, we investigate the impact of this cardinality hyper-parameter on topic coherence evaluation. For two au...

Machine Reading Tea Leaves: Automatically Evaluating Topic Coherence and Topic Model Quality

Topic models based on latent Dirichlet allocation and related methods are used in a range of user-focused tasks including document navigation and trend analysis, but evaluation of the intrinsic quality of the topic model and topics remains an open research area. In this work, we explore the two tasks of automatic evaluation of single topics and automatic evaluation of whole topic models, and pr...

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by enabling natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

An Automatic Approach for Document-level Topic Model Evaluation

Topic models jointly learn topics and document-level topic distribution. Extrinsic evaluation of topic models tends to focus exclusively on topic-level evaluation, e.g. by assessing the coherence of topics. We demonstrate that there can be large discrepancies between topic- and document-level model quality, and that basing model evaluation on topic-level analysis can be highly misleading. We propo...

Publication date: 2010
